UPSTREAM PR #18718: ggml: new backend for Virglrenderer API Remoting acceleration (v2)#867

Open
loci-dev wants to merge 36 commits into main from upstream-PR18718-branch_kpouget-upstream

Conversation


@loci-dev loci-dev commented Jan 9, 2026

Mirrored from ggml-org/llama.cpp#18718

This is a follow up of ggml-org/llama.cpp#17072

The API Remoting backend/frontend allows escaping the VM isolation, with the help of virt-gpu paravirtualization (and the virglrenderer library on the host side).

  • ggml-remotingfrontend is a GGML API implementation that intercepts GGML API calls and forwards them to the virt-gpu virtual device
  • ggml-remotingbackend is a library loaded by virglrenderer (a PR will be opened soon for discussion), which opens a GGML library and forwards the calls received from virglrenderer.

Here is the context behind this PR:

[image: architecture diagram giving the context for this PR]

See the Virglrenderer PR which enables the API Remoting trampoline required in Virglrenderer:
https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1590

  • this work focused on macOS, where in-VM/in-container inference performance is tied to the remoting stack
  • the code also works on Linux, but I did not thoroughly evaluate its performance there.


loci-review bot commented Jan 9, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've retrieved the summary report for your project. Here are the key findings:

Summary Report for llama.cpp PR #867

Performance Analysis Results:

No Significant Performance Regressions Detected

The analysis compared the base version (1bdd7101-ed4c-11f0-a055-c529586b3e1a) with the target version (ce032a91-ed61-11f0-a055-c529586b3e1a) and found:

  • No modified functions showed performance changes greater than 2% for:
    • Response Time (execution time per function call)
    • Throughput Time (time spent in function including callees)

Conclusion:

This pull request maintains performance stability and is safe to merge from a performance perspective. All changes are within the acceptable 2% variance threshold, indicating no meaningful performance degradation has been introduced.

@loci-dev loci-dev force-pushed the main branch 2 times, most recently from 6c70c0f to 7921e68 Compare January 9, 2026 17:11
@loci-dev loci-dev force-pushed the upstream-PR18718-branch_kpouget-upstream branch from a6ed565 to 9017716 Compare January 9, 2026 17:39

loci-review bot commented Jan 9, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've generated the summary report for your project. Here's what the analysis shows:

Summary Report for llama.cpp PR #867

The performance analysis indicates that Pull Request #867 has minimal to no performance impact on the llama.cpp repository:

Key Highlights:

  • No significant response time changes (all changes < 2%)
  • No significant throughput changes (all changes < 2%)
  • Performance-neutral: the changes maintain existing performance characteristics

Recommendation:

The PR can proceed without any performance concerns. No optimization work or further performance investigation is needed.

The comparison was made between:

  • Base version: b5662d81-ed7e-11f0-a055-c529586b3e1a
  • Target version: c1a9e141-ed83-11f0-a055-c529586b3e1a

This is a positive outcome showing that the code changes don't introduce any performance regressions.

@loci-dev loci-dev force-pushed the main branch 22 times, most recently from 4d62814 to 8ddc8e3 Compare January 13, 2026 09:13
@noamshiovits-Aurora

@loci-dev when was the latest analysis performed?


loci-review bot commented Jan 28, 2026

I don't have access to information about when the latest analysis was performed. The performance review report that I have access to doesn't include a timestamp indicating when the analysis was run.

If you need this information, you might want to check:

  • The metadata or header of the performance review report document
  • The system or platform that generated the analysis
  • Any accompanying logs or audit trails from the analysis tool

Is there something specific about the performance analysis results that I can help you understand instead?

@noamshiovits-Aurora

@loci-dev was the last analysis successful?


loci-review bot commented Jan 28, 2026

I don't have access to information about whether the last analysis was successful or not. I can see and work with the performance review report that's been provided in our conversation, but I don't have visibility into:

  • The execution status of the analysis job
  • Whether it completed successfully or with errors
  • Any logs or status messages from the analysis process
  • System health or diagnostic information

If a performance review report has been shared in this conversation, that would suggest the analysis completed and produced results. However, I can't confirm the success status definitively.

If you're concerned about the analysis status, you may want to:

  • Check the analysis platform or dashboard directly
  • Review any execution logs or status notifications
  • Verify with your system administrator or the tool that ran the analysis

Is there something specific in the performance analysis results you'd like me to help you understand or investigate?
